ABSTRACT



Researchers require benchmark test problems to evaluate 
the speed of computer codes designed to solve minimum cost 
network flow problems. To date, the only universally available
test problems developed for that purpose are randomly
generated. In practice, however, real-world network problems 
solve faster than random network problems. This thesis 
examines the effect on solution time resulting from applying 
structure, produced through simulation of real-world phenom- 
ena, to test networks. An efficient computer code, VSGEN, is 
developed which generates structured transportation and multi- 
echelon networks. Various types of structure, including unit 
flow cost, network topology and arc capacity, reduced the time 
required to solve the test networks an average of 26%, when 
using a primal network simplex solver, GNET. 

The parameter Big M used in primal simplex algorithms may 
affect solution times differently in structured versus 
unstructured networks. VSGEN is used to investigate this 
possibility. A bound on the minimum Big M is first developed 
for bipartite networks. This bound is sharper than the 
default bound used in GNET, but it does not reduce solution 
times in either structured or unstructured problems. Even the 
best possible bound reduces solution times by only 10%, on 
average.
I. INTRODUCTION



A. BACKGROUND 

Modern high speed computers have dramatically increased 
our ability to solve the large minimum cost network flow 
problems arising in industrial, governmental and military 
settings. The minimum cost network flow problem is the 
problem of transmitting a given supply of a single commodity 
through a network to meet a specified demand at the lowest 
cost. Flow is directed through the network on arcs with 
linear cost functions which represent the effort required to 
transmit flow on a given arc. These problems arise through 
many sources including inventory, scheduling, distribution, 
assignment, and other problems. The broad applicability of 
research on this topic has resulted in numerous competitive 
computer codes which solve the minimum cost network flow 
problem. Users and developers of these codes require the 
ability to compare competing algorithms and codes for 
problems with a variety of structures. Accurate comparisons 
can result only from testing the competitors on a set of 
standard test problems, but up to now, only one attempt has 
been made to derive such a set of test problems. This 
attempt is a network generator code, NETGEN, written by
Klingman, Napier, and Stutz [Ref. 11] in 1974. Since their 
initial work, a large number of articles, proprietary
network solution codes, and texts have used networks generated
by NETGEN as benchmark problems [Refs. 4, 8, 9, 14].

NETGEN constructs test problems of three types: 
transportation networks, assignment networks, and general 
minimum cost flow networks. The problems generated are 
essentially unstructured; the user controls the maximum arc 
cost, percentage of arcs with bounds on capacity, the number 
of source nodes, the total number of nodes m and the total 
number of arcs n, but the scheme by which the nodes are 
interconnected, the cost assigned to each arc, and the 
distribution of supply and demand are completely random. 
NETGEN has given network flow research a standard set of 
test problems, but whether these random test problems 
accurately reflect the performance of competitive codes on
real problems is questionable.

The original NETGEN paper acknowledges that problem 
structure influences the effectiveness of different 
algorithms. Bradley, Brown and Graves [Ref. 2] specifically 
state that their code, GNET, "solves real network models 
faster than random NETGEN problems of nominally comparable 
size and structure, suggesting that much remains to be 
learned from further investigation of special problem 
structures." More recently, in discussing shortest path 
problems, a special case of minimum cost network flow 
problems, Dial et al. [Ref. 4] indicate that "the most
efficient solution procedure depends on the topology of the 
network and the range of the arc length coefficient." The 
purpose of this thesis is to explore the range of effects 
that certain types of network structure have on speed of 
network solution codes, and to develop a network generator 
which allows researchers to exercise their algorithms and 
codes on networks exhibiting different types of structure. 

Real-world networks contain various types of structure 
which are readily apparent. For example, distribution 
networks sometimes exhibit a geographic echelon structure 
which has special topologies and dependencies among the flow 
costs. A hypothetical example of this might be a distribution
system in which a commodity enters the continental
U.S.A. through ports on the west coast. From the warehouses 
surrounding the ports of entry, the good is shipped to 
retail outlets in various regions across the country, but to 
reach the furthest region, the good must be shipped via 
distribution centers in intermediate regions (echelons). 
Assuming the cost of shipping is proportional to the 
distance over which it must be shipped, the distribution of 
costs inherits structure from the distribution of 
destinations.

Structure in a network model may also exist as a result 
of "gravity" modeled in traffic engineering [Ref. 16]. The 
gravity model can determine with surprising accuracy the
amount of rail traffic, commercial trucking, or even usage 
of the telecommunications networks between two cities. The 
gravity model is:

\[
\text{Amount of trade between cities A and B} \;\propto\; \frac{(\text{Population of city A}) \times (\text{Population of city B})}{\text{Distance between cities A and B}}
\]

Although this model represents a phenomenon of industry 
rather than a well-defined physical law, it does imply a 
certain amount of structure in industrial network flows. 
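
The gravity relationship can be sketched in a few lines of Python. This is only an illustration: the function name and the proportionality constant k are hypothetical, and the populations and distances are invented figures rather than data from this study.

```python
def gravity_trade(pop_a: float, pop_b: float, distance: float, k: float = 1.0) -> float:
    """Gravity-model estimate of trade between two cities.

    k is a hypothetical proportionality constant; the text gives no value for it.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_a * pop_b / distance


# Invented example: the same pair of cities at two different separations.
near = gravity_trade(100_000, 50_000, 200.0)
far = gravity_trade(100_000, 50_000, 400.0)
```

Doubling the distance halves the estimated trade, which is the kind of dependence between location and flow that a structured generator can imitate.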

Given that real-world problems are more accurately 
represented by test problems with simple structural 
assumptions, exploratory research needs to be done to 
ascertain which types of structure affect solution 
efficiencies and in what manner. The author chose to 
investigate patterns in cost, topology, and geographic 
echelons in this thesis. The number of variations on 
network structure is limited only by the imagination, so for 
this study, several representative selections are made. The 
intent here is not to exhaust all possibilities, but to gain 
insight into the design of test problems which may more 
accurately reflect performance of network flow algorithms on 
real problems. 



If these new networks exhibit advantages over random 
test networks, then a decision must be made as to whether a 
new set of standard problems should be distributed or 
whether the generator code itself should be distributed, 
allowing researchers to produce their own networks. The 
advantages to a single set of test problems are obvious. A 
standard group of benchmarks encourages comparisons on 
identical networks and gives users reliable reference 
points. Additionally, transferring data sets on magnetic 
tape rather than codes essentially eliminates problems of 
machine independence and guarantees reproducibility of 
results. However, providing a comprehensive set of problems 
for the countless types of structure is impossible. Passing
a generator to fellow researchers would allow them to
tailor test data to algorithms which may be written with
specific networks in mind. This thesis presumes that until
algorithm developers decide on what a representative group 
of structures is, it is better to allow researchers to 
select their own network structures. 

B. COMPUTATIONAL METHODOLOGY 

1. The Bounded Variable Primal Simplex Algorithm for
Networks

The minimum cost network flow problem may be viewed 
as a specialized linear program. 

\[
\begin{aligned}
\text{Minimize} \quad & cx \\
\text{subject to} \quad & Ax = b \\
& 0 \le x_j \le u_j
\end{aligned}
\]

where A is a node-arc incidence matrix with exactly
one +1 and one -1 in each column.
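
To make the constraint structure concrete, the following minimal sketch builds the node-arc incidence matrix for a three-node example and checks flow conservation. The sign convention (+1 at the tail node, -1 at the head node), the function names, and the numbers are assumptions chosen for illustration.

```python
def incidence_matrix(num_nodes, arcs):
    """Build the node-arc incidence matrix as a list of rows.

    arcs is a list of (tail, head) pairs with 0-based node indices.
    Assumed convention: column j holds +1 at the tail and -1 at the head.
    """
    A = [[0] * len(arcs) for _ in range(num_nodes)]
    for j, (tail, head) in enumerate(arcs):
        A[tail][j] = 1
        A[head][j] = -1
    return A


def satisfies_conservation(A, x, b):
    """Return True when Ax = b holds row by row."""
    return all(sum(a * xj for a, xj in zip(row, x)) == bi
               for row, bi in zip(A, b))


arcs = [(0, 1), (0, 2), (1, 2)]
A = incidence_matrix(3, arcs)
b = [5, 0, -5]   # node 0 supplies 5 units, node 2 demands 5 units
x = [3, 2, 3]    # flow on each arc, in the order listed in arcs
```

Here node 1 is a pure transshipment node: the 3 units it receives on arc (0, 1) leave again on arc (1, 2).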

Several methods are presently available for the solution of 
large minimum cost network flow problems including the 
primal simplex, dual simplex, primal-dual, out-of-kilter, 
and more [Refs. 2, 5, 13]. One of the most efficient primal 
network simplex algorithms is developed by Bradley, Brown 
and Graves in their GNET code [Ref. 2]. Because of its high 
speed and reliability, this code shall be used to compare 
the solution times of the structured networks produced in 
this thesis and the random networks created by NETGEN. 
Summarizing from the GNET paper, the primal simplex method
solves the linear programming problem in the following
manner.

Manipulation of the matrix A (with addition of unit 
vectors, representing artificial variables) may allow A to 
be partitioned into A = (B,N) , where B is an m x m matrix of 
linearly independent columns called a basis. Given a basis 
B, there will exist a unique $x_B^0$ such that

\[ B x_B^0 = b \tag{1} \]

If $x_B^0 \ge 0$, then

\[ x^0 = \begin{pmatrix} x_B^0 \\ 0 \end{pmatrix} \]

is a basic feasible solution to the original problem. Assume
such a solution is known. Partitioning c in the same manner
as A results in

\[ c x^0 = (c_B, c_N) \begin{pmatrix} x_B^0 \\ 0 \end{pmatrix} = c_B x_B^0 \tag{2} \]

A solution which satisfies the constraints of the original
problem may be written

\[ Ax = (B, N) \begin{pmatrix} x_B \\ x_N \end{pmatrix} = B x_B + N x_N = b \tag{3} \]

Since B is a basis, there exists a transformation Z such
that

\[ BZ = N \tag{4} \]

Algebraic manipulation of equations (3) and (4) yields

\[ B(x_B^0 - Z x_N) + N x_N = b \tag{5} \]

The general solution to equation (5) is then

\[ x = \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \begin{pmatrix} x_B^0 - Z x_N \\ x_N \end{pmatrix} \tag{6} \]

In this form the value of x is easily compared to the
current solution $x^0$ and improved solutions are readily
identified when they exist. From equation (6) and the
original objective function

\[ cx = c_B x_B^0 + (c_N - uN) x_N \tag{7} \]

where u, called the "dual variables" or "simplex
multipliers," is the solution of

\[ uB = c_B \tag{8} \]

From (7) and the constraint $x_N \ge 0$, a necessary condition
for an improved solution is that there exist a column of
N, $N^k$, such that

\[ c_k - uN^k < 0 \tag{9} \]

Given that at least one column $N^k$ of N, corresponding to
the variable $x_k$, satisfies (9), a candidate variable is
chosen from all those satisfying (9) for entry into the
basis, and a basic variable is selected to exit the basis by
way of the ratio test [Ref. 2]. Having selected the entering
and exiting variables, an algebraic process called pivoting
is performed to exchange the columns corresponding to the
entering and exiting variables. If inequality (9) is not
satisfied by any non-basic variable, then optimality has
been achieved. Otherwise, the search for an improved
solution continues and another iteration is performed.

The network specialization of the primal simplex
algorithm exploits the fact that all bases from A can be put
into upper triangular form and represented very compactly.
In the standard simplex algorithm, $BZ^k = N^k$ in (4) is
solved for $Z^k$ by computing $Z^k = B^{-1}N^k$, and $uB = c_B$
in (8) is solved for u by computing $u = c_B B^{-1}$. This
requires the expensive storage and updating of $B^{-1}$ at
each iteration. In contrast, with B in upper triangular
form, $BZ^k = N^k$ and $uB = c_B$ can be solved directly via
back substitution and forward substitution, respectively.
The representation of B in O(m) space, which is used instead
of a full m x m matrix, speeds these solutions in addition
to being space-efficient. Further advantages of the network
simplex method are that all-integer arithmetic can be used
if c and b are integer, and that specialized network data
structures allow very efficient pivoting, retriangulation of
the basis and updating of solutions.
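
The substitution idea can be illustrated with a small sketch. It is not the GNET data structure. It assumes the basis is a spanning tree stored as a dictionary of arcs, that a basic arc (i, j) contributes the equation u_i - u_j = c_ij (the +1 tail, -1 head convention), and that the root dual is fixed at zero. Arc capacities are ignored, so nonbasic arcs at their upper bounds are not handled.

```python
from collections import defaultdict


def tree_duals(num_nodes, tree_arcs, root=0):
    """Solve u_tail - u_head = cost on every tree arc, with u[root] = 0.

    tree_arcs maps (tail, head) to an integer cost and must form a spanning
    tree. Each dual is set once while walking the tree, so the work is O(m)
    and uses integer arithmetic only.
    """
    adj = defaultdict(list)
    for (i, j), c in tree_arcs.items():
        adj[i].append((j, c, +1))   # walking tail -> head
        adj[j].append((i, c, -1))   # walking head -> tail
    u = [None] * num_nodes
    u[root] = 0
    stack = [root]
    while stack:
        i = stack.pop()
        for j, c, direction in adj[i]:
            if u[j] is None:
                u[j] = u[i] - c if direction == +1 else u[i] + c
                stack.append(j)
    return u


def entering_candidates(u, nonbasic_arcs):
    """Return nonbasic arcs whose reduced cost c_ij - (u_i - u_j) is negative."""
    return [(i, j) for (i, j), c in nonbasic_arcs.items()
            if c - (u[i] - u[j]) < 0]


tree = {(0, 1): 4, (1, 2): 3}
u = tree_duals(3, tree)
candidates = entering_candidates(u, {(0, 2): 5, (2, 0): 1})
```

In this example the direct arc (0, 2) costs 5 while the tree path from node 0 to node 2 costs 7, so that arc prices out negative and is a candidate to enter the basis.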

2. Procedure for Comparison

The comparison used to evaluate the effects of 
various types of structure on solution of the minimum cost 
flow problem is the time required to achieve optimality. 
Although the number of pivots required to achieve optimality 
could also be used to compare these effects, the reliability
and consistency of that measure are highly suspect. All
pivots are not of equal difficulty (difficulty defined as 
the number of machine operations required to complete a 
pivot in a computer implementation). It is possible for a
solution which requires many pivots to be obtained in a
shorter amount of time than one which requires fewer pivots.
(For example, see Goldfarb and Reid
[Ref. 6] and their experiments with the "steepest-edge"
variant of the primal simplex algorithm applied to general 
linear programs.) Time costs the computer-user money and is 
therefore a pertinent measure and is the measure which will 
be used for comparisons in this thesis. Unfortunately, time 
comparisons are difficult on a virtual memory machine like 
the IBM 3033AP since there may be significant variation in 
run times for identical problems. To minimize this effect, 
the time comparison is accomplished by solving each test 
network five times and noting the mean of the five solution 
times. The code used to solve the test problems in all 
cases was GNET. Solution by a single code permits accurate 
comparison of the solution times with regard to the 
influence of structure, but is not necessarily indicative of 
the performance of all algorithms on such problems. Future 
analysis should include other algorithms to reveal the 
influence of structure on those methods as well as the 
primal simplex method. 
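
The repeated-run timing procedure can be sketched as follows. The harness is hypothetical: `solve` stands for any callable solver (GNET itself is not called here), and `time.process_time` is used as an assumed stand-in for the CPU timing facility of the original machine.

```python
import time
from statistics import mean


def mean_solve_time(solve, problem, repeats=5):
    """Run solve(problem) `repeats` times; return (mean CPU seconds, samples)."""
    samples = []
    for _ in range(repeats):
        start = time.process_time()
        solve(problem)
        samples.append(time.process_time() - start)
    return mean(samples), samples


# Stand-in workload: sorting a reversed list plays the role of the solver.
avg, samples = mean_solve_time(sorted, list(range(1000, 0, -1)))
```

Keeping the individual samples alongside the mean makes it possible to see how much run-to-run variation the averaging is hiding.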
The test networks generated ranged in size from 200
to 4,000 nodes and from 1,300 to 20,000 arcs. For each size,
NETGEN was used to produce a random network version and each 
of two structured network generators was used to construct 
problems of various structural types. The first generator 
produces networks structured with respect to cost only, but 
the second generator yields networks with a variety of 
structure including structured supply, demand, cost, 
capacity and topology. 

C. THE INFLUENCE OF BIG M ON SOLUTION TIME 

A structured network generator is a tool for evaluating
minimum cost network flow solution techniques. Other 
network research has indicated Big M [Ref. 1] is a parameter 
of the solution technique which can affect solution time 
[Ref. 7]. The version of GNET used in this research 
utilizes the Big M variant of the primal simplex method. In 
Chapter IV, an experiment evaluating the effect of Big M on 
solution times of network flow problems is performed which 
compares networks generated by the structured generators and 
by NETGEN. To facilitate discussion of Big M, the network 
linear programming problem shall be defined as before:

\[
\begin{aligned}
\text{(P)} \quad \text{Minimize} \quad & cx \\
\text{Subject to} \quad & Ax = b \\
& x \ge 0
\end{aligned}
\]



An initial basis matrix is required to initiate the simplex 
solution method used in existing network algorithms. If 
such a matrix is not readily apparent in the A matrix, an 
artificial vector, $x_a$, is introduced to give a convenient
starting point for the simplex method. When an artificial
vector is introduced, the initial basic feasible solution is
given by $x_a = b$ and $x = 0$. Modification of the constraints
requires modifying the objective function to reflect large
penalties for non-zero values of the artificial variables in
the problem solution. The new problem produced by these
changes is as follows: 



\[
\begin{aligned}
\text{P(M)} \quad \text{Minimize} \quad & cx + Mx_a \\
\text{Subject to} \quad & Ax + Ix_a = b \\
& x,\; x_a \ge 0
\end{aligned}
\]

where M is a very large number representing the
penalties.

In the network simplex method, these penalties are 
assigned only to variables associated with flow into sink 
nodes. Even though $x_a = b$ is a feasible solution to P(M), the
design of the simplex method will force the artificial
variables to zero in a search for the optimal solution to P,
if such a solution exists [Ref. 1].
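
The role of the penalty can be seen in a deliberately tiny example. It is a toy problem invented for illustration: a single constraint x + x_a = 5, with a unit cost of 10 on the real variable. The function names are hypothetical.

```python
def penalized_cost(c, x, x_a, big_m):
    """Objective value of P(M): cx plus M times the total artificial flow."""
    return sum(cj * xj for cj, xj in zip(c, x)) + big_m * sum(x_a)


def artificial_preferred(big_m):
    """Toy problem x + x_a = 5 with unit cost 10 on x.

    Returns True when leaving all 5 units on the artificial variable is
    cheaper than routing them on the real variable.
    """
    real = penalized_cost([10], [5], [0], big_m)
    artificial = penalized_cost([10], [0], [5], big_m)
    return artificial < real
```

With M = 3 the artificial solution costs 15 against 50 for the real one, so the simplex method would stop with the artificial variable still positive. With M = 100 the artificial solution costs 500 and is driven out. In this toy case any M above 10 is large enough.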
The disadvantage of the Big M method lies in attempting
to select a value for Big M a priori. Too large a value
will dominate the other cost coefficients in the objective 
function and may result in serious round-off errors in a 
computer or, in the case of the network simplex algorithm, 
problems with representation of large integers. However, 
too small a value will not force all the artificial 
variables to zero. In searching for the appropriate value 
for Big M, one must also be aware of the time lost to 
locating exactly the right value. It is believed that a 
close upper bound on the minimum acceptable Big M will be 
sufficient to reduce the CPU time required to run the 
simplex method without wasting time fine-tuning the estimate 
of Big M. In Chapter IV, a bound in bipartite networks 
based on the minimum and maximum cost arcs leaving the sink 
nodes is developed for this purpose. 

An alternative approach to investigating the effects of 
Big M is also used. A dual formulation of P(M) is as
follows:

\[
\begin{aligned}
\text{Maximize} \quad & ub \\
\text{Subject to} \quad & uA \le c \\
& uI \le M
\end{aligned}
\]

The second set of constraints implies $u_i \le M$ for every i;
i.e., M is an upper bound on the dual variables.
From this it is also true that if the optimal solution is
known, then one can say that $\max_i u_i$ would have been an
excellent estimate for Big M. Although in Chapter IV this
research attempts to determine and use a sharper bound on
Big M prior to finding the solution, an analysis of
$M = \max_i u_i$, the maximum dual variable in previously
solved problems, may produce insight into possible
differences in solution-time behavior of structured versus
unstructured problems, and might lead to better estimates
for Big M in future problems.

Several structured and unstructured networks of 
comparable sizes were tested to examine the change in 
solution time resulting from changes in Big M values. In 
each instance, the test problems were first solved using the 
maximum dual value previously obtained, and then solved 
numerous times with incrementally larger values of Big M 
until solution time did not change any further. The results 
of this investigation are included in Chapter IV. 
II. NETWORKS WITH STRUCTURED COSTS
A. PHILOSOPHY 

The most basic approach to the problem of generating a 
structured network is to create a network exhibiting 
structure in a single aspect, e.g., a network with random 
costs, random capacities, random supplies and demands, but 
structured topology. In this way, changes in solution 
efficiencies can be investigated with respect to changes in 
a single, isolated type of structure. Thus, a simple scheme 
to generate "singly" structured networks might be to take 
the feasible but random networks produced by NETGEN, and 
modify these networks to exclusively structure costs, or 
supplies and demands, or capacities, or topology. 

NETGEN usually produces feasible networks, which is
desirable, but it produces random costs, capacities and
topologies. This chapter details initial attempts to 
produce networks structured in a single aspect. However, 
due to the generation methodology utilized in NETGEN, this 
is not easily done except with respect to costs. 
Consequently, using cost as the single structural aspect, 
arc flow costs are structured to simulate those costs which 
might occur in a physical distribution network. In such a 
system, arc flow cost is often a function of the distance 
between the arc's head node, j, and tail node, i [Ref. 12].
For k=1 and p=2, Euclidean distance becomes a special case
of a function developed by Love and Morris [Ref. 12],

\[ d_{ij} = k \left[ |x_i - x_j|^p + |y_i - y_j|^p \right]^{1/p} \]

which can be used to estimate the actual road and shipping
distances between two points. In this chapter, arc cost is
then made a simple linear function of this distance since
this seems to represent real-world structure in certain
instances [Ref. 10].
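
As a quick check on the distance function, the sketch below evaluates it for two points. The function name is hypothetical, and the absolute values are an assumption made so that the expression is defined for any p.

```python
def lp_distance(point_i, point_j, k=1.0, p=2.0):
    """Distance k * (|x_i - x_j|**p + |y_i - y_j|**p) ** (1/p)."""
    (x_i, y_i), (x_j, y_j) = point_i, point_j
    return k * (abs(x_i - x_j) ** p + abs(y_i - y_j) ** p) ** (1.0 / p)


euclidean = lp_distance((0, 0), (3, 4))            # k = 1, p = 2
rectilinear = lp_distance((0, 0), (3, 4), p=1.0)   # k = 1, p = 1
```

For these two points the Euclidean case (p = 2) gives a distance of 5, while p = 1 gives the rectilinear distance of 7.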

A FORTRAN program, TRANS, was developed to take the arcs 
listed in the SHARE formatted output from NETGEN and replace 
arc costs with costs exhibiting the structure described 
above. The only user-defined input is the length to width 
ratio (r:l) of the rectangle into which the nodes are 
placed. TRANS assigns an (x,y) coordinate to each node 
generating y as a uniform (0,1) random deviate and x as a 
uniform (0,r) random deviate. (The uniform random number
generator used for this purpose is the LRND portion of
LLRANDOM II, a machine-specific random number generator
developed at the Naval Postgraduate School [Ref. 14].) Arc
costs are then created by scaling the Euclidean distance 
between head and tail nodes to lie between 0 and the user- 
defined value, maximum cost. The output from TRANS is 
identical to that from NETGEN with respect to node-arc
connections, arc capacities, and supplies and demands.
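
The cost replacement performed by TRANS can be sketched as below. This is not the FORTRAN source. The function names are hypothetical, Python's `random` module stands in for LLRANDOM II, and dividing by the rectangle's diagonal is an assumed way of scaling costs into the range from 0 to the maximum cost.

```python
import math
import random


def place_nodes(num_nodes, r, rng):
    """Give each node a point with x uniform on (0, r) and y uniform on (0, 1)."""
    return [(rng.uniform(0.0, r), rng.uniform(0.0, 1.0)) for _ in range(num_nodes)]


def distance_costs(arcs, coords, r, max_cost):
    """Replace arc costs with scaled Euclidean distances.

    Assumption: distances are divided by the rectangle's diagonal, the
    largest distance possible, so every cost lies in [0, max_cost].
    """
    diagonal = math.hypot(r, 1.0)
    costs = []
    for tail, head in arcs:
        (x1, y1), (x2, y2) = coords[tail], coords[head]
        d = math.hypot(x1 - x2, y1 - y2)
        costs.append(round(max_cost * d / diagonal))
    return costs


rng = random.Random(1985)   # fixed seed so the run is repeatable
coords = place_nodes(6, 3.0, rng)
costs = distance_costs([(0, 3), (1, 4), (2, 5)], coords, 3.0, 100)
```

The arc list is taken as given, which mirrors the way TRANS leaves the NETGEN topology, capacities, supplies and demands untouched.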

Because the networks generated in this way are structured 
with respect to cost only, and other aspects remain random, 
these networks shall be referred to as "pseudo-structured" 
networks.

After the pseudo-structured networks were generated, GNET
was used to solve each network five times and the mean
solution time was recorded.

Three variations of the basic structure were produced. 
Nodes were randomly placed in a square and in rectangles 
with length to width ratios of 3:1 and 20:1 in an attempt to 
determine if any of the shapes and the resulting structures 
would significantly affect solution times. If any substan- 
tial change was observed, then further structure in that 
direction could be explored. 

B. RESULTS 

A representative sampling of the computational results
for the pseudo-structured networks and NETGEN problems of
similar size is contained in Table I. As evidenced by
these values, the variation in the solution time ranged from 
34% less time to 37% more time required to solve the TRANS 
networks than the NETGEN networks. In most cases solution 
time appeared comparable. The small and inconsistent 
changes observed here indicate that if structure does affect 
solution times of network algorithms significantly, the cost
structure utilized by TRANS is inadequate to exhibit this,
or the reduction in solution time is not achieved through
cost structure alone.

TABLE I. Solution Times for Pseudo-Structured Networks
III. A STRUCTURED NETWORK GENERATOR



A. INTRODUCTION 

Pseudo-structured costs alone do not reveal any
advantage to structured test problems over randomized 
networks. However, by providing networks with more profound 
and complex structure, which more closely approximates 
structure found in real-world networks, a reduction in 
solution times may be accomplished. Characteristics which 
can be structured include supply and demand, arc flow 
capacities, echelon structures of the nodes, and a wide 
range of indegrees (number of incoming arcs) and outdegrees 
(number of outgoing arcs) for individual nodes. This 
chapter addresses the development of a completely new 
network generator which provides various attributes of 
structure to the feasible (or infeasible, if desired) 
transportation and multi-echelon test problems which it 
creates.

Ideally, a structured network generator would provide 
the user with the ability to choose any one of several 
alternatives for the amount and type of structure in the 
desired network. These alternatives might include the 
following:

(a) type of network (transportation, assignment, multi-echelon)

(b) number of capacitated arcs

(c) tightness of capacity constraints 

(d) number of node echelons 

(e) number of nodes 

(f) types of nodes (source, sink, transshipment source, 
transshipment sink, pure transshipment) 

(g) amount of supply and demand 

(h) choice of distributions with which supply, demand, 
and costs are allocated. 

Generation of test problems may be a major expense in 
the testing of minimum cost network flow solvers. For 
instance, NETGEN problems can take about five times more 
computer time to generate than to solve with GNET [Ref. 2]. 
Thus, another important design criterion for this generator
is efficiency in regard to computer time and storage
requirements. The process of constructing test networks
requires generation of random numbers, which can be extremely
time-consuming. Therefore, it is imperative that efficient
methodology be used in random number generation. In this
vein, it is better (faster) to create the random numbers in 
large groups and store them in an array rather than to call 
a random number subroutine each time another number is 
required. (This is the technique used in NETGEN.) The very 
nature of large industrial networks implies that test 
problems designed to simulate such networks will require 
considerable computer storage. However, it is not necessary
to store all the information generated. The arcs can be 
written directly to data files eliminating the need for arc- 
length arrays. Additional savings can be obtained by using 
arrays for several purposes rather than creating new arrays 
for each new requirement. A good example of this is reusing 
the arrays in which the random numbers are stored. 

Another factor affecting time usage is the number and
types of operations performed. A large portion of the
generation time in network programming would be contained in
determining flow patterns. A possible scheme for simulating
network flow would be to generate a distribution to
determine the likelihood and amount of flow between nodes.
These distributions could be based on node attributes such
as size, location, and many others. However, empirically
generating such distributions by examining all possible
pairs of nodes implies performing $O(m^2)$ operations. To
maintain the generator's efficiency, a compromise is
required between achieving real-world structure and
generation speed. Performing $O(m^2)$ operations is extremely
time-consuming and may require many orders of magnitude more
operations than a method which requires O(n) operations, n
being the number of arcs. A more efficient method would be
to pick head and tail node by some other method, even
randomly (a loss of some structure seems unavoidable), and
determine flow based on that choice. The number of required
operations is reduced to O(n) in this way.
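
As a rough illustration of the gap, take the largest problem sizes reported in Chapter I, m = 4,000 nodes and n = 20,000 arcs. Pairing these two figures in a single network is an assumption made only for this example.

```latex
\[
m^2 = 4000^2 = 1.6 \times 10^{7}, \qquad
n = 2 \times 10^{4}, \qquad
\frac{m^2}{n} = 800 .
\]
```

Under that assumption, examining every pair of nodes would take on the order of 800 times as many operations as a single pass over the arcs.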

It would be impossible to provide a network generator 
that could produce every possible type of structure. A more 
realistic approach is to build a framework from which 
researchers can develop test problems which meet specific 
needs. The framework needs to be general enough to be able 
to readily accept user supplied subroutines for structure 
beyond the capability of the basic generator. VSGEN is one 
implementation of a framework that meets the requirements 
set forth here. It is a FORTRAN program designed to 
efficiently create structured transportation and multi- 
echelon networks. Storage and time requirements are 
considered in the development, and the procedures used easily
allow expansion.

1. Detailed Implementation 

The methodology used in VSGEN is uncomplicated but 
effective. The program generates structure through the 
simulation of some real-world phenomena. Each node is
assigned a set of attributes which include its rank and a
population based on that rank and location. The method for
assigning node location developed in TRANS is also used 
here. Populations are assigned by using a phenomenon known 
as Zipf's Law [Ref. 16]:

\[ (\text{population}) \times (\text{rank}) \approx (\text{constant}) \]

Since the node with the largest population will be ranked
one, this relationship indicates that the constant is
approximately equal to the maximum population over all node
populations. The maximum population used in VSGEN is
min {100 x m, 100,000}. The ceiling of 100,000 is utilized
to prevent difficulties with large integer arithmetic on the 
computer and the associated storage problems. Each node's 
population is defined by a random variable which is normally 
distributed about a mean of maximum population divided by 
node rank. The standard deviation associated with each 
population is assumed to be one-tenth of the node population.
The LNORM portion of the LLRANDOM II random number
generator [Ref. 14] was used to produce the necessary normal
random variables. 
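
A minimal sketch of this population assignment follows. It makes three assumptions: Python's `random.gauss` stands in for LNORM, the standard deviation is taken as one-tenth of each node's mean population, and draws are rounded and floored at 1 so that every population is a positive integer. The function name is hypothetical.

```python
import random


def zipf_populations(m, rng, ceiling=100_000):
    """Return (maximum population, list of node populations ordered by rank)."""
    max_pop = min(100 * m, ceiling)
    populations = []
    for rank in range(1, m + 1):
        mean = max_pop / rank                   # Zipf: population x rank ~ constant
        draw = rng.gauss(mean, 0.1 * mean)      # assumed: sigma = mean / 10
        populations.append(max(1, round(draw)))  # assumed floor of 1
    return max_pop, populations


max_pop, populations = zipf_populations(50, random.Random(0))
```

For m = 50 the maximum population is 5,000, so the node of rank 10 is centered near 500.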

The total supply is then distributed among the 
supply (source) nodes based on the population at each 
source. This pattern for supply allocation is used because 
it is reasonably assumed that nodes with larger populations 
have larger supplies in real-world networks. The portion of 
supply at source i is determined by 



\[ \text{supply at source } i = \frac{(\text{total supply}) \times (\text{population at source } i)}{\text{total of source node populations}} \]
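
The allocation rule can be sketched with integer arithmetic as follows. The function name is hypothetical, and handing the truncation remainder to the most populous source is an assumption made so that the supplies sum exactly to the total.

```python
def allocate_supply(total_supply, source_populations):
    """Split total_supply among sources in proportion to population."""
    total_pop = sum(source_populations)
    supplies = [total_supply * p // total_pop for p in source_populations]
    # Assumed rule: units lost to integer truncation go to the largest source.
    largest = source_populations.index(max(source_populations))
    supplies[largest] += total_supply - sum(supplies)
    return supplies


supplies = allocate_supply(1000, [5000, 2500, 1667])
```

In this example truncation loses 2 of the 1,000 units, and both are returned to the first source.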



After supply has been allocated, the program uses the
following methodology to build networks. Multi-echelon
(more than two echelons) networks are created by concatenating
transportation networks, i.e., two-echelon networks,
together. Thus the transportation network subroutine is
called (k-1) times by the multi-echelon routine to create
k-echelon networks.

B. TRANSPORTATION NETWORKS 

The transportation (two-echelon) algorithm proceeds by 
first determining the outdegree for a given source node in 
proportion to (1) the total number of arcs, and (2) the 
square root of the supply at that source. The relationship 
used is based on 



\[ \text{outdegree at source } i \;\propto\; \frac{(\text{total no. of arcs}) \times \sqrt{\text{supply at source } i}}{\text{total supply}} \]

The reason for the use of square root of supply at source i 
is speculation by the author. If the outdegree is made 
directly proportional to the supplies, then in smaller 
problems the nodes with small supplies are occasionally 
assigned an outdegree of only one or two. This seems 
unlikely in the structure of the real world. Making 
outdegree a function of the square root of the supplies 
effectively reduces the severely skewed nature of the 
distribution of arcs and produces more intuitively appealing 
results. Additional constraints on the number of arcs 
originating at any source prevent the outdegree from being
less than one or greater than the number of sinks. Having
at least one arc coming from each source helps ensure
feasibility; precluding the number of arcs from being 
greater than the number of sinks helps reduce the number of 
parallel arcs, i.e., the number of arcs with the same head 
and tail nodes. 
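
The outdegree rule and its two limits can be sketched as below. The exact normalization used in VSGEN is not recoverable from the text, so this sketch assumes that the square roots are normalized by their own sum, which makes the outdegrees add up to roughly the requested number of arcs. The function name is hypothetical.

```python
import math


def outdegrees(total_arcs, supplies, num_sinks):
    """Outdegree per source, proportional to the square root of its supply.

    Assumed normalization: divide by the sum of square roots. Each result
    is truncated to an integer, then limited to the range 1..num_sinks.
    """
    roots = [math.sqrt(s) for s in supplies]
    scale = total_arcs / sum(roots)
    return [min(num_sinks, max(1, int(scale * r))) for r in roots]


degrees = outdegrees(20, [100, 25, 1], 10)
```

With supplies of 100, 25 and 1, a directly proportional rule would give the smallest source less than one arc. The square root narrows the spread, and the lower limit guarantees that source at least one outgoing arc.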

After the number of arcs emanating from a source is 
determined, each of those arcs is randomly assigned a sink 
by choosing a number from the discrete uniform distribution 
on $(1, m_2)$ where $m_2$ is the number of sinks. As each
destination is chosen, the arc cost is determined as a 
function of the Euclidean distance between source and sink 
node. The results of using the Euclidean distance did not 
reveal any gain in solution times over uniformly distributed 
costs. In an attempt to determine if other cost structures 
would produce reductions in solution times, cost was made 
proportional to the square root of Euclidean distance to 
simulate a distribution system with decreasing marginal cost 
per mile. 
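The square-root cost rule can be sketched as follows. This is an illustration under stated assumptions: the function name `arc_cost`, the normalization by a maximum distance, the lower bound `min_cost`, and the integer truncation are hypothetical choices for the example.

```python
import math

def arc_cost(source_xy, sink_xy, max_dist, max_cost, min_cost):
    """Unit cost that grows with the square root of Euclidean distance."""
    dist = math.hypot(source_xy[0] - sink_xy[0], source_xy[1] - sink_xy[1])
    scaled = max_cost * math.sqrt(dist / max_dist)
    # Integer truncation and the min_cost floor are assumptions of this sketch.
    return max(min_cost, int(scaled))
```

Under this rule a node four times as far away costs only twice as much per unit of flow, which is the decreasing marginal cost per mile mentioned above.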

The gravity model discussed in Chapter I is utilized 
myopically in determining the additional demand to be 
placed at the newly chosen sink node. The population at 
the source and sink are multiplied, and the product is 
divided by the Euclidean distance between the nodes. This
value and the total supply at the current source are then
combined to define the proportion of flow merited in the
current arc. The methodology used here avoids the $O(m^2)$
operations which would be required to empirically determine a
flow distribution as previously discussed in Section III-A.
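A minimal sketch of this myopic gravity allocation is given below. It assumes that the gravity weights of the sinks chosen for one source are normalized by their sum and that flows are truncated to integers; the function name `allocate_flow` and its argument layout are hypothetical.

```python
def allocate_flow(supply, pop_source, sinks):
    """Split one source's supply over its chosen sinks by gravity weight.

    sinks is a list of (population, distance) pairs with distance > 0.
    """
    weights = [pop_source * pop / dist for pop, dist in sinks]
    total = sum(weights)
    # Each sink receives its share of the supply, truncated to an integer.
    return [int(supply * w / total) for w in weights]
```

The work per source is linear in the number of sinks chosen for it, since only those sinks are examined.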

At this point, only arc capacity remains to be defined. 

Arc capacity is determined by multiplying a user-defined 
value by the amount of flow just obtained. This feature 
allows the user to control the "tightness" of the arc 
capacities, and consequently, to explore the effect of 
capacity upon solution time. For generation of feasible 
networks, the only requirement is that the capacity 
multiplier must be greater than or equal to one. Values 
less than one will result in infeasible networks because arc 
capacities will not allow enough flow to satisfy demand. 
Actually, the ability to create networks with varying 
degrees of infeasibility is a useful property of VSGEN. 
Infeasible problems are not uncommon in practice and the 
testing of new solution codes should include infeasible 
problems.
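The capacity rule can be sketched in a few lines of Python. The function name `arc_capacity` and its arguments are assumptions made for illustration; using total supply as the capacity of an uncapacitated arc follows the algorithm outline in Section III-D.

```python
def arc_capacity(assigned_flow, capmul, capacitated, total_supply):
    """Capacity of one arc given the flow just assigned to it."""
    if not capacitated:
        return total_supply              # effectively unbounded
    return int(assigned_flow * capmul)   # tight when capmul is near 1

# A multiplier below one yields a capacity smaller than the assigned flow,
# which is how an infeasible test problem would arise in this sketch.
tight = arc_capacity(40, 1.5, True, 1000)
too_small = arc_capacity(40, 0.5, True, 1000)
```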

Arc costs and capacities are created in the above manner 
until all supply has been allocated. Integer truncations 
occasionally prevent a small percentage of the requested 
number of arcs from being generated, but in all cases the 
total supply is completely distributed. Accurate monitoring
of the amount of supply remaining also precludes the
generator from allocating more than the total amount of
supply.

C. MULTI-ECHELON NETWORKS

Individual echelons in the multi-echelon networks are
generated in the same manner as those in the two-echelon
networks. Each node is assigned population attributes
exactly as before. Arc costs and capacities are also assigned
using the previous methodology. The difference between
producing two-echelon and multi-echelon networks occurs in
generating the location of each node and in the number of
arcs between adjacent echelons of nodes.

In this research, location attributes are assigned using 
two separate procedures. The first method assigns positions 
to all nodes inside one rectangle, regardless of echelon, 
just as TRANS did in Chapter II. The second method for 
assigning locations allows the researcher to evaluate any 
reduction in solution times available through a geographic 
echelon structure. The total area over which the network is 
defined remains unchanged. Source nodes are located at one 
end of the rectangle and sink nodes at the opposite end. 
Those nodes account for two of the requested echelons in 
each problem. The other nodes (transshipment nodes) are 
assigned to regions in the interior of the rectangle between 
sources and sinks. The region size is equal for all
echelons including those of supply and demand, and consequently,
the width of a region is inversely proportional to
the number of echelons requested by the user. The height of
each region remains constant for all cases.
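The geographic echelon placement can be sketched as follows, assuming a rectangle of width `ratio` and height 1 divided into equal vertical strips, one per echelon. The function names and the unit height are assumptions of this sketch.

```python
import random

def echelon_region(e, n_echelons, ratio):
    """Return (x_low, x_high, y_low, y_high) for echelon e = 1..n_echelons."""
    width = ratio / n_echelons           # equal width for every echelon
    return ((e - 1) * width, e * width, 0.0, 1.0)

def place_node(e, n_echelons, ratio, rng):
    """Draw uniform coordinates for one node inside its echelon's strip."""
    x_lo, x_hi, y_lo, y_hi = echelon_region(e, n_echelons, ratio)
    return rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi)
```

For example, with four echelons and `ratio = 2.0`, the second echelon occupies the strip from x = 0.5 to x = 1.0.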

Flow progresses through the network from one echelon to 
the next with flow permitted only between adjacent echelons. 
Each unit of flow must transit through all echelons 
sequentially; i.e., no echelon may be bypassed and flow does 
not backtrack into previously transited echelons. This
feature is patently different from the transshipment
procedure utilized in NETGEN. Although NETGEN allows the 
user to request transshipment nodes, the generator does not 
treat those nodes as belonging to a set of one or more 
echelons. In proceeding from a pure source to a pure sink 
node, flow may pass through transshipment nodes, but it is 
not required to do so. The totally random nature of NETGEN 
allows flow on arcs between any two nodes except between two 
pure sources or two pure sinks. Using such a scheme for 
network generation precludes analysis of geographic echelon 
structure.

The manner in which flow is directed through each network
generated by VSGEN is simple. Although the number of
nodes in each echelon is user-defined, the number of arcs
between echelons is not. For simplicity, that number is
assumed to be proportional both to the product of the numbers
of nodes in the two adjacent echelons and to the total number
of arcs to be generated. The generator then utilizes the
number of arcs between echelons as input to the subroutine
used to create a two-echelon network. All flow in the
current echelon is passed to the next echelon before any
flow is passed on to subsequent echelons. The current and
immediately subsequent echelons are treated as a two-echelon
network unto themselves, the current echelon being the
supply nodes and the subsequent echelon being the demand
nodes. When the flow between those two regions is complete,
the subsequent echelon is then designated as the supply
echelon and its successor is designated the demand echelon.
This process is continued until flow has passed completely
through the network to the true demand nodes.
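The division of the requested arcs among adjacent echelon pairs can be sketched as below. The function name `arcs_between_echelons` is hypothetical, and integer floor division is assumed as the truncation rule.

```python
def arcs_between_echelons(echelon_sizes, total_arcs):
    """Arcs per adjacent echelon pair, proportional to the product of sizes."""
    products = [a * b for a, b in zip(echelon_sizes, echelon_sizes[1:])]
    total = sum(products)
    # Floor division models the integer truncation of each share.
    return [total_arcs * p // total for p in products]

# Three echelons of 10, 40 and 250 nodes sharing 5,000 requested arcs.
shares = arcs_between_echelons([10, 40, 250], 5000)
```

In this example the shares sum to 4,999 rather than 5,000, which illustrates how truncation can leave the generator slightly short of the requested number of arcs.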

D. OUTLINE OF VSGEN 

This section specifies the required input to VSGEN and 
outlines the VSGEN algorithm. The ability to structure the 
test networks in several aspects results in slightly more 
complicated input compared to NETGEN. Table II shows the 
required input for VSGEN. 

The output format utilized is the SHARE format, the same 
as that produced by NETGEN, because this is probably the 
most widely used network format [Refs. 3, 5, 11, and 15]. 

Table II. VSGEN Input Specifications

VSGEN, the algorithm for creating structured
transportation and multi-echelon networks, is outlined as
follows:

VSGEN

Input:  NNOD   = Total number of nodes
        NARC   = Total number of arcs
        ITSUP  = Total supply
        MAXCST = Maximum allowable cost on any arc
        MINCST = Minimum allowable cost on any arc
        ITYPE  = Type of network; 2 = transportation,
                 13-19 = multi-echelon (3-9 echelons)
        LECHELN(e), e=1,...,NECHELN, = Number of nodes in each
                 echelon e
        CAPMUL = Capacity multiplier, CAPMUL > 0
        ICAP   = Capacitated network indicator; 0 = uncapacitated,
                 1 = capacitated

Output: Feasible network in SHARE format if CAPMUL ≥ 1 or
        ICAP = 0, else an infeasible network in SHARE
        format.



(1) Read input.

    Define N = set of nodes, i = 1,...,NNOD

           $N_e$ = set of LECHELN(e) nodes in echelon e,
                 e = 1,...,NECHELN

           RATIO = Ratio of x to y in rectangle containing
                 nodes

    If ITYPE ≥ 13, NECHELN = ITYPE - 10.
    Otherwise, NECHELN = 2.



(2) Assign node attributes

    (a) (i) If a geographic echelon structure is used,
            then for each echelon e, and for each node
            $i \in N_e$, randomly assign coordinates X(i) and
            Y(i) in rectangle bounded by the coordinates

            $$\left((e-1)\times\frac{\text{RATIO}}{\text{NECHELN}},\ 0\right),\quad \left((e-1)\times\frac{\text{RATIO}}{\text{NECHELN}},\ 1\right),\quad \left(e\times\frac{\text{RATIO}}{\text{NECHELN}},\ 0\right),\quad \left(e\times\frac{\text{RATIO}}{\text{NECHELN}},\ 1\right)$$

            and let MAXDIST = $\left[\left(\frac{\text{RATIO}}{\text{NECHELN}}\right)^2 + 1\right]^{1/2}$

        (ii) Otherwise, for each node $i \in N$, randomly assign
             coordinates X(i) and Y(i) in rectangle bounded
             by the coordinates

             (0,0), (0,1), (RATIO,0), (RATIO,1)

             and let MAXDIST = $[\text{RATIO}^2 + 1^2]^{1/2}$

    (b) For each node $i \in N$, randomly assign node rank
        RANK(i).

    (c) Let MAXPOP = min{100×NNOD, 10^} be the maximum
        node population.

    (d) For each $i \in N$, randomly assign node population
        POP(i) using normal distribution having mean
        MAXPOP/RANK(i) and standard deviation
        0.1×MAXPOP/RANK(i), but truncated below 1.



(3) Distribute total supply over all source nodes.

    (a) Let TOTPOP = $\sum_{i \in N_1} \text{POP}(i)$

    (b) For each $i \in N_1$, let ISUP(i) = ITSUP×POP(i)/TOTPOP

    (c) For each $i \in N - N_1$, let ISUP(i) = 0.

(4) For each node $i \in N_1$, write out source node information
    in SHARE format, i and ISUP(i).



(5) Determine the number of arcs to be created between
    echelons.

    (a) If ITYPE=2, let NARC(1) = NARC and go to ( ).

    (b) Let NTARC = $\sum_{e=1}^{\text{NECHELN}-1}$ LECHELN(e)×LECHELN(e+1)

    (c) For e=1 to NECHELN-1,
        let NARC(e) = NARC×LECHELN(e)×LECHELN(e+1)/NTARC.

(6) Let e=1.

(7) Let ISQSUP = $\sum_{i \in N_e} \text{ISUP}(i)^{1/2}$

(8) For each node $i \in N_e$

    (a) Let OD = NARC(e)×ISUP(i)^{1/2}/ISQSUP be the outdegree
        of node i.

    (b) Randomly choose from $N_{e+1}$ a set of OD tail nodes
        $T_i$ for arcs emanating from i.

    (c) For each node $j \in T_i$

        (i) Let DIST = $[(X(i)-X(j))^2 + (Y(i)-Y(j))^2]^{1/2}$ be the
            distance from i to j.

        (ii) Let FLOW(j) = POP(i)×POP(j)/DIST = proportion of
             total flow from i going to j.

        (iii) Let COST(j) = max{MINCST, [MAXCST×(DIST/MAXDIST)^{1/2}]}
              be the cost assigned to arc i,j.

    (d) Let TOTFLOW = $\sum_{j \in T_i} \text{FLOW}(j)$.

    (e) For each node $j \in T_i$,

        (i) Let IASSGN = FLOW(j)×ISUP(i)/TOTFLOW be the
            amount of flow to be assigned to node j from
            node i.

        (ii) Let ISUP(j) = ISUP(j)+IASSGN be the current
             total amount of flow assigned to node j.

        (iii) If ICAP=0, CAP=ITSUP, else CAP=IASSGN×CAPMUL.

        (iv) Write out arc information in SHARE format,
             i, j, COST(j), and CAP.

(9) If e < NECHELN-1, let e=e+1 and go to (7).

(10) For each node $j \in N_{\text{NECHELN}}$, write out demand node information
     in SHARE format, j and ISUP(j).

End of VSGEN

E. RESULTS

Several types of structured networks were compared to 
NETGEN networks of the same size. Variations in ratio of 
supply and demand nodes, assignment of location attributes, 
number of nodes and arcs, and number of echelons were 
included in the evaluations. The ratio of supply to demand 
nodes ranged from severely skewed (few sources, many sinks) 
to equal numbers of sources and sinks. In no case were 
there more sources than sinks. The location attributes were 
assigned in two ways. One method assigned locations 
randomly inside a rectangle, similar to the methodology of 
TRANS. The second method assigned location according to 
node echelon, simulating the geographic structure described 
in Section III-C. The number of nodes and arcs ranged from
400 to 2,000 and 5,000 to 15,000, respectively. Two, three, 
and four echelon networks were evaluated. For each 
capacitated VSGEN network, a range of values for the 
capacity multiplier was tested and the values recorded in 
Table III reflect those versions which resulted in the 
fastest computation times. The capacity multiplier values 
ranged from 7.5 to 50.0. Though these may seem like large
capacities, these values are small when compared to the
capacities allowed on the NETGEN arcs which ranged between
one one-hundredth and one-tenth of total supply.

(a) TRANS "pseudo-structure" utilized for arc costs (nodes distributed in one large rectangle)

(b) Nodes assigned according to geographic echelons 

(c) This size of network not generated with geographic echelon structure 

(d) NETGEN network produced was infeasible 



VSGEN revealed several interesting trends. The more 
efficient methodology used to produce random numbers 
achieved impressive reductions in the time to generate the 
test networks, as much as 78% less time to generate VSGEN 
problems than comparably sized NETGEN problems. As it 
should, the amount of time required by VSGEN appears to be 
directly proportional to the number of arcs requested. 

Table III shows the generation times for VSGEN and for 
NETGEN problems of comparable size. The "Node-Echelon 
Distribution" column in that table designates the number of 
nodes assigned to each echelon. The first number represents 
the number of nodes in the first echelon (sources), the last
number represents the number of nodes in the final echelon
(sinks), and any numbers in between represent the interior
echelons (pure transshipment nodes). For example, the entry
10x40x250 indicates 10 sources, 40 transshipment nodes, and 
250 sinks. 
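A small sketch shows how such an entry maps onto the generator inputs of Section III-D, where ITYPE = 2 denotes a transportation network and ITYPE = 13 to 19 denotes 3 to 9 echelons. The function name `parse_distribution` is hypothetical.

```python
def parse_distribution(entry):
    """Turn an entry such as '10x40x250' into echelon sizes and an ITYPE code."""
    sizes = [int(part) for part in entry.split("x")]
    n_echelons = len(sizes)
    # Two echelons is a transportation network; otherwise ITYPE = 10 + echelons.
    itype = 2 if n_echelons == 2 else 10 + n_echelons
    return sizes, itype
```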

The time required to solve the structured networks was
also consistently less than that for the random NETGEN networks. The
reductions ranged from 1% to 59%; the mean was a 26%
reduction in solution time. Comparisons between structured
networks indicated that the ratio of sources to sinks also
affects solution time. Consistently shorter times were
evident for skewed networks, i.e., those with more sinks
than sources. In fact, there seems to be a direct
relationship between the ratio of sources to sinks and
average solution time. As the ratio decreases, so does the
solution time. This relationship holds true for two,
three, and four echelon networks. In the multi-echelon
networks, there is no detectable difference between solution
times for three or four echelon problems of approximately
the same skewness. However, as with the transportation
problems, the solution times were shorter for networks with
fewer sources than transshipment nodes and sinks than for
networks with an approximately equal number of nodes in each
echelon. These comparisons are evident in Table III.

The results obtained also indicated that one of the most 
sensitive factors in determining solution time is arc flow 
capacity. Extremely tightly capacitated problems, those 
with a capacity multiplier close to one, can use as much as 
five times the amount of CPU time as the same network with 
uncapacitated arcs. In no case did a tightly capacitated 
problem solve more quickly than one with loose constraints. 
Figure 1 shows a sampling of capacity versus time
relationships from the networks solved.

Figure 1. Effect of Capacity Constraints on Network Solution Time
(axis label: Capacity Multiplier)



IV. AN EXPERIMENT ON BIG M USING VSGEN



A. ANALYTICAL DEVELOPMENT 

This chapter presents a use of VSGEN as a vehicle for 
comparing solution times in structured test networks and 
random test networks when Big M is experimentally allowed to 
vary. As discussed in the Introduction, small values of Big 
M in the primal simplex method may reduce the solution times 
of minimum cost network flow problems. Examining the 
effects of Big M presents an excellent opportunity for 
comparing structured and random networks. VSGEN and NETGEN 
are used in this chapter to generate test networks upon 
which the effects of varying Big M may be evaluated. A 
sampling of the "pseudo-structured" networks of Chapter II 
is also included in the evaluation. 

Before generating the test networks, it is prudent to 
analytically examine Big M. Reductions in solution times 
resulting from substantially reducing the value for Big M 
have already been claimed by Gregoriadis [Ref. 7]. Too 
large a value for Big M can cause numerical difficulties. 

It is desirable, therefore, to find a value for Big M, M^, 
which is as small as possible, yet which allows a feasible 
solution to be found if one exists. In this section, a 
bound on Big M in bipartite networks is derived which is not
computationally burdensome and this value is compared to the
default value used in GNET. In the next section, the bound 
is also compared to the optimal value of Big M obtained by 
solving the minimum cost flow problem. 

An $m_1 \times m_2$ bipartite network is a network with a set of
$m_1$ source nodes S, and a set of $m_2$ sink nodes T, such that
$m = m_1 + m_2$. Furthermore, all arcs are of the form (i,j)
where $i \in S$ and $j \in T$. Transportation networks and
assignment networks are examples of commonly occurring 
bipartite networks. Bipartite networks offer a relatively 
simple structure upon which to base initial calculations for 
the bound on Big M. Consequently, the bound developed in 
this section is directly applicable to bipartite networks 
only, but similar developments might extend the bound to 
more general networks. 

Given that Big M must be an upper bound on the dual 
variables, and the duals represent the marginal cost for a 
change in flow to a given node, one can logically evaluate a 
worst case change in flow in a given network, i.e., the 
largest possible value for a dual variable. To determine 
the cost for an increase in demand of one unit at demand 
node j, one needs to understand the chaining effect that the 
increased demand might cause. For this development, a p × p
bipartite network is assumed with $p = m_1 = m_2$, and it is
further assumed that $c_{ij} \ge 0$ for all arcs (i,j). If demand
at node $j_1$ is increased by one unit, this extra unit might
be supplied directly from supply node $i_1$, along arc $(i_1, j_1)$.
However, this may cause a reduction in flow of one unit
along arc $(i_1, j_2)$, $j_2 \ne j_1$, which in turn results in a
deficit of one unit of flow at node $j_2$ which must be
supplied from some node $i_2 \ne i_1$. This chaining effect may
continue along a chain of arcs $(i_1, j_1), (i_1, j_2), \ldots, (i_h, j_h)$,
resulting in a net increase in cost of

on the marginal cost associated with a
nodes $i_1, i_2, \ldots, i_h$, in that order, is

Since we are concerned with the worst case, an upper bound
on the cost associated with any chain using h sink nodes is

of an increase of one unit of flow at any sink node. For
computational reasons, we use an upper bound on C derived
as follows:

If a feasible solution to a bipartite network flow
problem exists, then $M_u$ defined as above ensures that the
feasible solution will be found. $M_u$ will be smaller than
the default value for Big M used by GNET, $m \times \max_{(i,j)} c_{ij}$, since
$M_u$ will always be less than or equal to $\frac{1}{2} m \times \max_{(i,j)} c_{ij}$.
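A hedged sketch of how the chain argument leads to the stated comparison with GNET's default value follows. It uses only the assumptions given in the text (nonnegative costs, a p × p network, a chain through h sink nodes). The symbols $\Delta$ and $c_{\max}$ are introduced here for illustration, and the steps are not claimed to reproduce the thesis's own intermediate expressions.

$$
\Delta \;=\; \sum_{k=1}^{h} c_{i_k j_k} \;-\; \sum_{k=1}^{h-1} c_{i_k j_{k+1}}
\;\le\; \sum_{k=1}^{h} c_{i_k j_k}
\;\le\; h\, c_{\max}
\;\le\; p\, c_{\max}
\;=\; \tfrac{1}{2}\, m\, c_{\max},
\qquad c_{\max} = \max_{(i,j)} c_{ij}.
$$

The first inequality drops the subtracted terms, which is valid because all costs are nonnegative. The remaining steps use the facts that a chain can visit at most p distinct sink nodes and that m = 2p in a p × p network.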

B. RESULTS 

The sharper bound on Big M, $M_u$, was recorded for all
random bipartite networks generated by NETGEN. Rarely was
this estimate significantly more than a 50% reduction from
the default value used in GNET. Similar results were
observed for the TRANS networks. For the structured
bipartite networks generated by VSGEN, the bound value was
as much as a 70% reduction from GNET's default value.
However, in large networks even a 70% reduction translates
into a value which is quite large compared to maximum
absolute cost. The values for Big M which Gregoriadis
reports are necessary to reduce solution times are on the
order of $1.5 \times \max c_{ij}$ to $5.0 \times \max c_{ij}$. Sharper results
might be obtained for an $m_1 \times m_2$ bipartite network where
$m_1 < m_2$, since the maximum length of the chains used in the
derivation of $M_u$ would be $2m_1 - 1$. Even those bounds would
be several orders of magnitude greater than the values
necessary, and so the bound was not fully developed for such
networks. The sharper bound here does not appear to be
useful for reducing computation times, but might be helpful
in avoiding numerical problems associated with handling
large integer values on a computer.
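To make the scale of this comparison concrete, the arithmetic can be sketched in Python. The node count, maximum cost, and function name below are hypothetical values chosen for illustration; only the form of the default value (number of nodes times maximum cost), the 70% reduction, and the 1.5 to 5.0 multiples come from the text.

```python
def big_m_scale(n_nodes, max_cost, reduction_pct):
    """Return (default Big M, reduced bound, bound as a multiple of max cost)."""
    default = n_nodes * max_cost                     # GNET default: m x max cost
    bound = default * (100 - reduction_pct) // 100   # e.g. a 70% reduction
    return default, bound, bound / max_cost

default, bound, multiple = big_m_scale(n_nodes=2000, max_cost=100, reduction_pct=70)
# With these illustrative numbers the bound is still 600 x max cost,
# far above the 1.5 to 5.0 multiples attributed to Gregoriadis.
```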

Although the bound, $M_u$, was not sharp enough to compare
with Gregoriadis' claims, the maximum dual variable obtained 
using GNET was recorded for the test networks and used as 
Big M in an attempt to validate those claims. Big M was 
incrementally increased from this starting point until no 
further reduction in solution time was evident. 

For all networks, regardless of structure, the
reductions in solution times were insignificant over the
entire range of values of Big M for which feasibility was
maintained, contrary to Gregoriadis's observations. As
the value for Big M increased slowly from its minimum, the 
solution time quickly increased to equal the time of the 
solution which utilized GNET's default value. 

For the TRANS and NETGEN networks, using the maximum 
dual for Big M resulted in a reduction in solution times of 
only 1% to 9% over the solution time achieved using the 
default value of Big M. The results of reducing Big M in 
solving networks created by VSGEN were only slightly better 
(faster). In all cases, the reductions in solution time 
were less than 15% and the mean maximum reduction was 10%. 
Figure 2 illustrates the effects of varying Big M in solving 
random and pseudo-structured networks. Figure 3 illustrates
the results for the structured networks generated by VSGEN.
In both figures, $M = a \times \max c_{ij}$.

Figure 2. Effect of Big M on Pseudo-Structured Networks

Figure 3. Effect of Big M on Multiechelon Networks



V. SUMMARY AND DIRECTIONS FOR FURTHER RESEARCH 



A. SUMMARY 

1. Methodology

The research presented here was initiated with the 
intent of determining the effect of structure on the time 
required to solve minimum cost network flow problems in 
relation to the solution time required for random networks 
of the same type and size. The type and complexity of 
structure required to cause changes in solution times were 
to be explored, as well as the type of structure to which 
solution time was most sensitive. To perform the analysis 
it was necessary to construct the framework for a structured 
network generator that was efficient and easily expandable 
to various structural specifications. Further, a sharper 
bound on Big M was desired in order to examine claims that 
lower values of Big M reduced network solution time 
significantly.

The framework for a structured network generator 
has been successfully created in VSGEN. The program forms 
feasible, structured transportation and multi-echelon test 
networks quickly and reliably. It does not yet have the 
capability to produce assignment or general transshipment 
problems. At its present stage of development, VSGEN allows
the user to apply structure to the arc flow costs, arc flow
capacities, and node-echelon distribution. Further, the 
user controls total supply, total number of nodes, total 
number of nodes in each echelon of a multi-echelon problem, 
tightness of arc capacities, and maximum unit flow cost. 

The costs are structured in one of two ways. The first 
method assigns node locations randomly within a rectangle. 
The second method assigns nodes to sections of the rectangle 
in direct relation to the echelon number to which a node is 
assigned. In each case the cost for flow is proportional to 
the square root of the Euclidean distance between nodes. 

The amount of flow assigned is determined based on node 
attributes including location and node population. The 
population of each node is determined by application of 
Zipf's Law. 
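A minimal sketch of this rank-based population assignment follows, modeled on step (2) of the VSGEN outline. Several details are assumptions made for the example: ranks are taken to be a random permutation of 1 to the number of nodes, truncation below 1 is implemented by clamping, and `pop_cap` is a hypothetical parameter for the upper limit on the maximum node population.

```python
import random

def node_populations(n_nodes, pop_cap, rng):
    """Assign each node a rank and a population of roughly max_pop / rank."""
    max_pop = min(100 * n_nodes, pop_cap)
    ranks = list(range(1, n_nodes + 1))
    rng.shuffle(ranks)                    # random rank per node (assumed permutation)
    pops = []
    for rank in ranks:
        mean = max_pop / rank             # Zipf-like: population falls off as 1/rank
        draw = rng.gauss(mean, 0.1 * mean)
        pops.append(max(1.0, draw))       # clamp below at 1 (assumed truncation rule)
    return ranks, pops
```

With this rule a few low-rank nodes receive most of the population, so supply and gravity-model flows concentrate on them.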

To test the hypothesis that structured networks 
solve more quickly than random networks generated by NETGEN, 
a wide variety of problems were solved by GNET, an efficient 
primal simplex code for minimum cost network flow problems. 
The mean of five solution times for each network was used in 
the time comparisons. The network parameters that were 
varied were total number of nodes and arcs, ratio of sources 
to sinks, node-echelon distribution, total supply, and arc 
capacity.

The sensitivity of network solution time to the
parameter Big M was also tested by initially setting that 
parameter equal to the maximum dual variable at optimality. 
Big M was incremented upward from that value until no 
further change in solution time was evident. In addition, a 
bound on Big M sharper than the default value used in GNET, 
$m \times \max c_{ij}$, was developed.

2. Findings

VSGEN produces test networks faster than NETGEN. In 
some cases, generation of networks by NETGEN requires more 
than three times the amount of computer time required by 
VSGEN for comparable networks. Further, VSGEN consistently 
produces feasible networks when they are requested in 
contrast to NETGEN, which occasionally generates unrequested 
infeasible networks (negative demands). These infeasibilities
can be avoided at times by changing the random
number seed in NETGEN. A more reliable method for ensuring
feasibility is to input a total supply several orders of 
magnitude greater than the number of nodes. 

VSGEN's only apparent difficulty is in generating
exactly the requested number of arcs. Some instances 
require the user to inflate the input to obtain the desired 
number of arcs. The cause of this is a combination of the 
method of arc-to-node allocation and integer truncation. 
NETGEN does not exhibit this error and in most cases
produces slightly more than the number of arcs
requested.

The structured networks are consistently solved more 
quickly than random networks when capacities are not too 
restrictive. However, different types of structure affect 
solution times to varying degrees. The cost structure used 
in the TRANS program in Chapter II, defined as being in
direct proportion to the distance between nodes, does not 
appear to have any significant effect on solution times. 
Structured transportation networks generated by VSGEN with 
an equal number of sources and sinks solve faster than 
random networks of the same size. This indicates that the 
combination of the different cost structure and methodology 
which assigns flow in VSGEN does result in reduced solution 
times. The individual contributions of cost and flow 
structure have not been determined. The most influential 
factors are node-echelon distribution and arc capacity. A 
severely skewed node-echelon distribution produces much 
shorter solution times than networks with equal numbers of 
nodes between the echelons. The addition of one or two more 
echelons to a skewed network did not reveal any significant 
change. Likewise, changing from random placement of nodes 
to geographic echelon structure in the multi-echelon 
problems did not result in reduced solution times.

The most sensitive parameter affecting solution time
is arc capacity. Tightly capacitated networks may require 
more than five times as much time to solve as uncapacitated 
networks. Some capacitated problems do solve more quickly 
than the same networks uncapacitated, at certain levels of 
capacitation.

In contrast to the information presented by 
Gregoriadis, Big M did not significantly reduce network 
solution time. The largest reductions achieved averaged 
approximately 10%. Additionally, the sharper bound on Big M 
analytically determined in Chapter IV is not of any 
practical significance at the present time. The bound 
achieves gains between 50% and 70% over GNET's default 
value, which might be useful in avoiding numerical 
difficulties with large integers in computer storage, but 
this bound is still several orders of magnitude greater than
the values necessary to influence solution times. 

B. DIRECTIONS FOR FURTHER RESEARCH 

VSGEN should be expanded to allow user selection of a 
wider variety of structures. Subroutines to handle 
assignment problems and other structures encountered in 
network research should be developed. Suggestions for 
additional subroutines include the following: 

(1) subroutines which would allow the user to generate 
assignment and general transshipment problems.

(2) a structured distribution for determining which
nodes may communicate or be connected. Although random
selection of sink nodes is simple and efficient, it may be
desirable to assign sinks to sources based on the relative 
location of the head and tail nodes and other node 
attributes as well. As an intermediate step in achieving 
such structure, one may wish to simply assign nodes to 
classes and allow communication between only specified 
classes. Assignment problems would be created effectively 
this way. 

(3) a subroutine which would assign location by region
instead of echelon; that is, divide the rectangular area in 
which the nodes have been placed, horizontally as well as 
vertically. Region could become an attribute on which to 
base the distribution discussed in (2). This would allow 
creation of general transshipment problems which are not 
bipartite.

(4) a subroutine that would allow replication of a 
network over multiple time periods and create inter- 
connections between these "temporal echelons" depending on 
the time required to travel between nodes. 

(5) a subroutine that would allow the user to define 
the function of Euclidean distance from which unit flow cost 
is determined. 

Other portions of this thesis warrant further effort, 
also. Some simple modifications are required to improve
NETGEN and VSGEN. The procedure for assigning outdegree in
VSGEN should be refined so that the user may be assured of 
producing networks with the requested number of arcs. The 
random number generator in NETGEN should be replaced by 
updated versions to examine the possibility that the old 
random number subroutine is the cause of NETGEN's slower
generation times. In any case, NETGEN's procedure should be
revised to reflect the more efficient method of generating 
random numbers in groups rather than one at a time. 

Finally, and most significantly, NETGEN's method of
allocating supply needs to be changed to ensure generation
of feasible networks. The infeasible problems produced 
without warning result in wasted time and effort. 

The most important direction for further research is to 
broaden the base of comparison developed in this thesis. It 
is clear that structure affects solution time, but the 
contributions of the various aspects of structure are yet 
unclear. Future study should compare networks with isolated 
types of structure to reveal individual contributions. 
Different topologies, such as those found in inventory 
problems, should be explored. Although substantial
demonstrations of the difference between structured and 
random networks have been presented, statistical 
significance in this research can be achieved only through 
extensive effort in varying network structure.